Text Mining Scientific Papers: A Survey on FCA-Based Information Retrieval Research

Authors

  • Jonas Poelmans
  • Dmitry I. Ignatov
  • Stijn Viaene
  • Guido Dedene
  • Sergei O. Kuznetsov
Abstract

Formal Concept Analysis (FCA) is an unsupervised clustering technique, and many scientific papers are devoted to applying FCA in Information Retrieval (IR) research. We collected 103 papers published between 2003 and 2009 that mention FCA and information retrieval in the abstract, title, or keywords. Using a prototype of our FCA-based toolset CORDIET, we converted the PDF files containing the papers to plain text, indexed them with Lucene using a thesaurus of terms related to FCA research, and then created the concept lattice shown in this paper. We visualized, analyzed, and explored the literature with concept lattices and discovered multiple interesting research streams in IR, of which we give an extensive overview. The core contributions of this paper are the innovative application of FCA to the text mining of scientific papers and the survey of FCA-based IR research.
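
The pipeline in the abstract (plain text, thesaurus-based indexing, concept lattice) can be illustrated with a minimal sketch. This is not CORDIET and does not use Lucene: plain substring matching stands in for the indexing step, and the paper texts, the thesaurus terms, and the function names `build_context`, `extent`, and `formal_concepts` are hypothetical choices made only for this example.

```python
def build_context(texts, thesaurus):
    """Map each paper id to the thesaurus terms found in its text.

    Assumption: case-insensitive substring matching replaces the
    Lucene-based indexing described in the abstract.
    """
    return {
        paper_id: {term for term in thesaurus if term in text.lower()}
        for paper_id, text in texts.items()
    }


def extent(intent, context):
    """Return all papers that contain every term in `intent`."""
    return frozenset(p for p, terms in context.items() if intent <= terms)


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context.

    Every concept intent is an intersection of paper term sets; the
    empty intersection is the set of all terms.
    """
    all_terms = frozenset().union(*context.values())
    intents = {all_terms}
    for terms in context.values():
        # Close the current family of intents under intersection with this paper.
        intents |= {frozenset(terms) & i for i in intents}
    concepts = [(extent(i, context), i) for i in intents]
    # Largest extents first, i.e. from the top of the lattice downwards.
    return sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1])))


# Hypothetical toy data, not taken from the surveyed papers.
texts = {
    "paper1": "FCA applied to information retrieval.",
    "paper2": "An ontology built with FCA for information retrieval.",
    "paper3": "FCA and ontology merging.",
}
thesaurus = ["fca", "information retrieval", "ontology"]

context = build_context(texts, thesaurus)
concepts = formal_concepts(context)

for papers, terms in concepts:
    print(sorted(papers), sorted(terms))
```

Each printed pair is one node of the concept lattice: a maximal set of papers together with the full set of thesaurus terms they share. This enumeration is only practical for small contexts; larger collections would call for a dedicated FCA algorithm.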

Similar resources

Formal Concept Analysis in Knowledge Discovery: A Survey

In this paper, we analyze the literature on Formal Concept Analysis (FCA) using FCA. We collected 702 papers published between 2003 and 2009 mentioning Formal Concept Analysis in the abstract. We developed a knowledge browsing environment to support our literature analysis process. The PDF files containing the papers were converted to plain text and indexed by Lucene using a thesaurus containing te...

Using Concept Lattices for Text Retrieval and Mining

The potentials of formal concept analysis (FCA) for information retrieval (IR) have been highlighted by a number of research studies since its inception. With the proliferation of small-size specialised text databases available in electronic format and the advent of Web-based graphical interfaces, FCA has then become even more appealing and practical for searching text collections. The main adv...

FCA and IR: The Story So Far

The application of Formal Concept Analysis (FCA) to Information Retrieval (IR) is twenty-five years old. Over this period, a number of papers have explored the potentials of FCA for various information finding tasks while several system prototypes have been made available for experimentation and testing. In this talk we survey what has been achieved so far, discussing lessons and implications f...

FCAIR 2012 – Formal Concept Analysis Meets

The application of Formal Concept Analysis (FCA) to Information Retrieval (IR) is twenty-five years old. Over this period, a number of papers have explored the potentials of FCA for various information finding tasks while several system prototypes have been made available for experimentation and testing. In this talk we survey what has been achieved so far, discussing lessons and implications f...

Topic Modeling and Classification of Cyberspace Papers Using Text Mining

The global cyberspace networks provide individuals with platforms to interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and much more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...

Publication date: 2011